Skip to content

backup: snapshot_reader (Phase 0a foundation for snapshot-decode binary)#792

Open
bootjp wants to merge 3 commits into
mainfrom
backup/snapshot-reader
Open

backup: snapshot_reader (Phase 0a foundation for snapshot-decode binary)#792
bootjp wants to merge 3 commits into
mainfrom
backup/snapshot-reader

Conversation

@bootjp
Copy link
Copy Markdown
Owner

@bootjp bootjp commented May 19, 2026

Summary

Adds internal/backup/snapshot_reader.go: the .fsm file parser
that underpins the cmd/elastickv-snapshot-decode binary (design
docs/design/2026_04_29_proposed_snapshot_logical_decoder.md lines
440-491). This is the foundational step toward Phase 0a's
operator-facing decoder; the dispatcher + cmd driver that consume
this reader come in a follow-up PR.

Mirrors the native Pebble snapshot format produced by
store/lsm_store.go::pebbleSnapshotMagic + restoreBatchLoopInto:

[8 bytes]   magic "EKVPBBL1"
[8 bytes]   lastCommitTS (LittleEndian uint64)
repeated:
  [8 bytes]   keyLen (LittleEndian uint64)
  [keyLen]    encoded key  = <userKey><invTS(8 BE)>
  [8 bytes]   valLen (LittleEndian uint64)
  [valLen]    encoded value = <flags(1)><expireAt(8 LE)><body>

Per-entry decoding strips:

  • The 8-byte inverted-TS suffix → recovers commit_ts via XOR
  • The 9-byte value header → surfaces tombstone, encryption_state, expireAt

API surface: ReadSnapshot(io.Reader, callback) + SnapshotHeader +
SnapshotEntry + sentinel errors. Callback-based so callers can
stream multi-GB snapshots without buffering; emitted byte slices
alias scratch buffers, so callers must bytes.Clone for retention.

Fail-closed contract (matches design §7.1)

  • ErrSnapshotBadMagic — first 8 bytes not "EKVPBBL1" (operator
    pointed the binary at an MVCC streaming snapshot, a tar, etc.)
  • ErrSnapshotTruncated — EOF mid-entry. A clean inter-entry EOF
    is the normal terminator and is NOT an error.
  • ErrSnapshotShortKey — encoded key < 8 bytes
  • ErrSnapshotShortValue — encoded value < 9 bytes
  • ErrSnapshotEncryptedReserved — reserved encryption_state
    bits (0b10, 0b11) or bits 3-7 set — fail-closed trip-wire
    parallel to store.ErrEncryptedValueReservedState
  • ErrSnapshotEncryptedEntryencState=0b01. Phase 0a does
    not link the decryption keyring; Stage 8 of the encryption
    rollout will add a keyring-aware variant.

Self-review (5 lenses)

  1. Data loss — tombstone flag surfaced verbatim (callers
    choose to suppress for backup output or include for
    diagnostic dumps). EOF distinguishes clean terminator from
    truncation.
  2. Concurrency / distributed — single-pass io.Reader; no
    shared state, no goroutines.
  3. Performance — per-entry slices alias a fixed 64 KiB key
    buffer + a growable value buffer. For a 10M-entry snapshot
    this keeps allocs O(1) outside the value-grow path. bufio
    amortises read syscalls.
  4. Data consistencyencryption_state is decoded into the
    identical 4-way switch (cleartext / encrypted / 0b10 /
    0b11) that store/lsm_store.go::decodeValue uses, so a
    future live-side enum extension fails closed here too.
  5. Test coverage — 17 table-driven tests in
    snapshot_reader_test.go:
    • header-only / single-entry round-trip
    • tombstone flag surfaced
    • multi-entry order preservation
    • bad magic / truncated header / truncated entry
    • short-key / short-value
    • encrypted entry rejection
    • all four reserved-encState cases
    • callback-error propagation
    • MVCC TS XOR-inversion across uint64 boundary
    • empty-value body

Caller audit (per /loop standing instruction)

This PR adds a NEW public API surface (ReadSnapshot,
SnapshotEntry, SnapshotHeader, PebbleSnapshotMagic, the
Err* sentinels). No prior callers exist. Verified via
grep -rn "ReadSnapshot|SnapshotEntry|SnapshotHeader"
across the tree — all matches are inside the two new files. The
follow-up PR (cmd driver + dispatcher) is this API's first consumer.

Test plan

  • go test -race ./internal/backup/ → ok
  • golangci-lint run ./internal/backup/... → 0 issues
  • go build ./... → ok
  • go vet ./internal/backup/... → ok

Summary by CodeRabbit

  • New Features
    • Added support for reading snapshots in Pebble native format with proper key-value pair extraction and MVCC metadata decoding (commit timestamps, expiry, tombstones).
    • Implemented validation and fail-safe mechanisms for snapshot integrity, including bounds checking, truncation detection, and encryption state validation.

Review Change Stack

Adds internal/backup/snapshot_reader.go: the .fsm file parser that
underpins the cmd/elastickv-snapshot-decode binary (design
docs/design/2026_04_29_proposed_snapshot_logical_decoder.md
lines 440-491). Mirrors the native Pebble snapshot format produced
by store/lsm_store.go::pebbleSnapshotMagic + restoreBatchLoopInto.

File shape consumed:

  [8 bytes]   magic "EKVPBBL1"
  [8 bytes]   lastCommitTS (LittleEndian uint64)
  repeated:
    [8 bytes]   keyLen (LittleEndian uint64)
    [keyLen]    encoded key  = <userKey><invTS(8 BE)>
    [8 bytes]   valLen (LittleEndian uint64)
    [valLen]    encoded value = <flags(1)><expireAt(8 LE)><body>

Per-entry decoding strips:
- The 8-byte inverted-TS suffix from the key (recovers commit_ts
  via XOR-inversion across the math.MaxUint64 boundary).
- The 9-byte value header from the value (surfaces tombstone flag,
  encryption_state, expireAt).

API surface (ReadSnapshot + SnapshotHeader + SnapshotEntry +
sentinel errors). Callback-based so callers can stream
multi-GB snapshots without buffering the whole file in memory; the
emitted SnapshotEntry's byte slices alias scratch buffers and the
caller must bytes.Clone for retention.

Fail-closed contract (matches design 7.1):
- ErrSnapshotBadMagic    - first 8 bytes not "EKVPBBL1" (operator
                           pointed at the wrong file)
- ErrSnapshotTruncated   - EOF mid-entry (a clean inter-entry EOF
                           is the normal terminator and is NOT
                           an error)
- ErrSnapshotShortKey    - encoded key < 8 bytes (no room for TS)
- ErrSnapshotShortValue  - encoded value < 9 bytes (no room for
                           value header)
- ErrSnapshotEncryptedReserved - reserved encryption_state bits
                           (0b10, 0b11) or bits 3-7 set; same
                           fail-closed trip-wire as
                           store.ErrEncryptedValueReservedState
- ErrSnapshotEncryptedEntry - encState=0b01 (encrypted).
                           Phase 0a does not link the decryption
                           keyring; Stage 8 of the encryption
                           rollout adds a keyring-aware variant.

Self-review:
1. Data loss   - tombstone flag surfaced verbatim so callers can
   choose to suppress (default Phase 0a backup behavior) or
   include (a future diagnostic dump might want both). EOF
   distinction between clean terminator and truncation prevents
   silently dropping a partial entry.
2. Concurrency - reader is single-pass over an io.Reader; no
   shared state, no goroutines.
3. Performance - per-entry slices alias a fixed 64 KiB key buffer
   and a growable value buffer. For a 10M-entry snapshot this
   keeps allocs O(1) outside the value-grow boundary. bufio
   wrapper amortises read syscalls.
4. Consistency - encryption_state bits are decoded into the
   identical 4-way switch (cleartext / encrypted / 0b10 / 0b11)
   that store/lsm_store.go::decodeValue uses, so a future
   live-side enum extension fails closed here too.
5. Coverage - 17 table-driven tests in snapshot_reader_test.go
   covering: header-only, single-entry round-trip, tombstone
   flag, multi-entry order preservation, bad magic, truncated
   header, truncated entry, short-key, short-value, encrypted
   entry, all four reserved-encState cases, callback-error
   propagation, MVCC TS XOR-inversion across uint64 boundary,
   empty-value body.

Caller audit (per /loop standing instruction): this PR adds a
NEW public API surface (ReadSnapshot, SnapshotEntry,
SnapshotHeader, PebbleSnapshotMagic, the Err* sentinels). No
prior callers exist. Verified via
'grep -rn "ReadSnapshot|SnapshotEntry|SnapshotHeader" --include=*.go'
- all matches are inside internal/backup/snapshot_reader{,test}.go.
The follow-up PR (cmd/elastickv-snapshot-decode driver +
dispatcher) will be this API's first consumer.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented May 19, 2026

@claude review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 19, 2026

Warning

Rate limit exceeded

@bootjp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 52 minutes and 43 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: db6eb891-553d-42f2-b1e2-ce12b84c6f6f

📥 Commits

Reviewing files that changed from the base of the PR and between a86e6ba and 8e6e40e.

📒 Files selected for processing (2)
  • internal/backup/snapshot_reader.go
  • internal/backup/snapshot_reader_test.go
📝 Walkthrough

Walkthrough

This PR adds a new Pebble snapshot reader that decodes the native EKVPBBL1 format, extracting user keys and values alongside MVCC metadata (commit timestamp, expiry, tombstone), with guarded I/O and encryption rejection, tested comprehensively across format variants and error cases.

Changes

Snapshot Reader Implementation

Layer / File(s) Summary
Format definition and data contracts
internal/backup/snapshot_reader.go (1–178)
Snapshot format constants (PebbleSnapshotMagic, MaxSnapshotEncodedKeySize, MaxSnapshotEncodedValueSize), exported error variables for validation (bad magic, truncation, encryption rejection, length violations), and data structures for SnapshotHeader (decoded LastCommitTS) and SnapshotEntry (user key/value, commit timestamp, expiry, tombstone flag).
Reader implementation and orchestration
internal/backup/snapshot_reader.go (179–425)
ReadSnapshot entry point iterates through snapshot entries, invoking a callback for each; supporting functions read and validate length prefixes (readEntryLen), decode headers (readSnapshotHeader), read encoded keys/values with bounds checks (readEntryKey, readEntryValue), handle buffered I/O with EOF translation (readExact, readExactGrow), and decode MVCC metadata while rejecting encryption and reserved bits (decodeSnapshotEntry).
Core behavior and round-trip tests
internal/backup/snapshot_reader_test.go (1–172)
Test infrastructure (snapBuilder) constructs valid snapshot byte streams; tests validate header-only snapshots yield no entries, single entries round-trip user keys/values with correct MVCC metadata, tombstone bits surface correctly, and multi-entry ordering is preserved on-disk.
Error handling and truncation tests
internal/backup/snapshot_reader_test.go (174–466)
Tests validate fail-closed behavior for bad magic, truncated headers, truncated entries at various boundaries (key-length, key bytes, value-length), and that callback errors propagate immediately without further iteration.
Validation, encryption, and edge case tests
internal/backup/snapshot_reader_test.go (253–514)
Tests enforce encoding-size budgets, validate boundary acceptance, reject short key/value encodings, reject encrypted entries and reserved encryption-state bits, verify commit-timestamp inversion round-trips across math.MaxUint64 boundaries, and confirm empty-value bodies decode without errors; includes an io.EOF sanity check.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A snapshot reader hops through bytes,
Decoding Pebble formats with care,
MVCC timestamps, tombstones in sight,
Encryption checks—bounds held fair,
From header to entry, a fluffy prepare! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly identifies the main change: a snapshot reader implementation for the backup package, establishing foundational infrastructure for a snapshot-decode binary.
Docstring Coverage ✅ Passed Docstring coverage is 96.43% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch backup/snapshot-reader

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link
Copy Markdown

claude Bot commented May 19, 2026

Claude finished @bootjp's task in 2m 53s —— View job


Code Review — internal/backup/snapshot_reader.go

Overall: This is clean, well-documented code and the format constants are correctly mirrored from store/lsm_store.go. I found one real bug (wrong sentinel returned on a specific truncation path), one robustness gap (unbounded allocation), and a few smaller issues.


  • Read snapshot_reader.go
  • Read snapshot_reader_test.go
  • Cross-reference with store/lsm_store.go (format source of truth)
  • Review: correctness, safety, performance, consistency
  • Post findings

Bug: discarded eof bool in readOneEntry (lines 198–200)

// current code
vLen, _, err := readEntryLen(r)
if err != nil {
    if cockroachdberr.Is(err, io.EOF) { // ← dead code
        return false, cockroachdberr.WithStack(ErrSnapshotTruncated)
    }
    return false, err
}

readEntryLen never returns io.EOF as an error — it converts a clean EOF at the start of an 8-byte field to (0, true, nil). By discarding the eof bool with _, the clean-EOF case (snapshot ends exactly at the value-length field) falls through with vLen=0. Then readExactGrow succeeds reading 0 bytes, and decodeSnapshotEntry(key, []byte{}) returns ErrSnapshotShortValue instead of the correct ErrSnapshotTruncated.

Fix:

vLen, vEOF, err := readEntryLen(r)
if err != nil {
    return false, err
}
if vEOF {
    return false, cockroachdberr.WithStack(ErrSnapshotTruncated)
}

Missing test: TestReadSnapshot_RejectsTruncatedEntry covers truncation at the key-length field, but there is no test for a snapshot that ends cleanly right after the key bytes (i.e., at the value-length field boundary). This specific case would reproduce the bug above.

Fix this →


Robustness: No length bounds check before make([]byte, n)

readExactGrow (line 276) calls make([]byte, n) where n is a raw uint64 from the file. A corrupt snapshot with kLen = 0xFFFFFFFFFFFFFFFF causes OOM or a runtime panic.

The live-side restoreFieldLenInt in lsm_store.go validates against maxPebbleEncodedKeySize and maxSnapshotValueSize before making any allocation. The offline reader needs similar guards:

const (
    maxSnapshotKeyLen   = maxSnapshotKeySize + snapshotTSSize    // mirror live constant
    maxSnapshotValueLen = maxSnapshotValueSize + snapshotValueHeaderSize + encEnvelopeOverhead
)

func readExactGrow(r io.Reader, dst []byte, n uint64, maxN uint64) ([]byte, error) {
    if n > maxN {
        return nil, cockroachdberr.Newf("backup: field length %d exceeds max %d", n, maxN)
    }
    // ...
}

For a Phase 0a offline tool the blast radius is process-local, but a panicking decoder is worse than a clean error. Fix this →


Naming: typo in snapshotEncStateCleartx

snapshotEncStateCleartx (tx suffix) vs encStateCleartext in the live store. The truncated suffix doesn't carry its weight — Cleartext is the obvious word. The test file uses this constant directly, so a future reader would need to mentally map the typo.


Minor: double WithStack on key-read truncation

readExact already wraps ErrSnapshotTruncated with WithStack. Then readOneEntry line 189 wraps again:

key, err := readExact(r, keyScratch[:0], kLen)
if err != nil {
    return false, cockroachdberr.WithStack(err) // ← redundant
}

The redundant WithStack adds a spurious frame pointing to the caller rather than the actual error site. Propagate err directly.


API design note: SnapshotHeader not surfaced for empty snapshots

SnapshotHeader is only delivered via the per-entry callback, so for a snapshot with zero entries the caller never receives LastCommitTS. The PR description and test both acknowledge this, but the design note at lines 440-491 says the header should be recorded in MANIFEST.json — if the backup library sees an empty shard snapshot it silently drops the commit-horizon. Consider returning SnapshotHeader as a direct return value from ReadSnapshot:

func ReadSnapshot(r io.Reader, fn func(SnapshotEntry) error) (SnapshotHeader, error)

Format constants — all correct ✓

Constant store/lsm_store.go snapshot_reader.go
timestamp suffix size timestampSize = 8 snapshotTSSize = 8
value header size valueHeaderSize = 9 snapshotValueHeaderSize = 9
tombstone mask 0b0000_0001 0b0000_0001
encState mask/shift 0b0000_0110 / 1 0b0000_0110 / 1
reserved high-bit mask 0b1111_1000 0b1111_1000
magic bytes {'E','K','V','P','B','B','L','1'} identical ✓

Key encoding (userKey || BigEndian(^ts)) and value encoding (flags(1) || expireAt(8 LE) || body) match fillEncodedKey / fillEncodedValue exactly.


Summary

Severity Issue
Bug Discarded eof bool → wrong sentinel (ErrSnapshotShortValue vs ErrSnapshotTruncated) + dead io.EOF branch + missing test
Robustness Unbounded allocation in readExactGrow on corrupt length field
Minor Typo snapshotEncStateCleartxsnapshotEncStateCleartext
Minor Double WithStack on key-read error path
Design SnapshotHeader invisible for empty snapshots

The format compatibility and the fail-closed encryption/reserved-state logic are solid. The bug above is the only one that changes which sentinel callers see, and it only manifests on a specific truncation pattern not covered by the existing tests.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 42fb7ba3db

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/backup/snapshot_reader.go Outdated
if err != nil {
return false, cockroachdberr.WithStack(err)
}
vLen, _, err := readEntryLen(r)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Return truncation when value length is missing

When a file ends immediately after the key bytes, readEntryLen returns (0, true, nil), but this call discards the eof flag and proceeds as if the entry had a zero-length value. That makes this mid-entry truncation surface as ErrSnapshotShortValue rather than the advertised ErrSnapshotTruncated, so callers cannot reliably distinguish a cut-off snapshot from a malformed encoded value. Please check the eof result here and return ErrSnapshotTruncated for this case.

Useful? React with 👍 / 👎.

Comment on lines +277 to +278
if uint64(cap(dst)) < n {
dst = make([]byte, n)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Bound snapshot length prefixes before allocating

For a corrupt or adversarial snapshot with a huge key/value length prefix, this path allocates make([]byte, n) directly from the untrusted uint64 length. That can panic with len out of range or OOM the operator decoder instead of failing closed; the live restore path guards the same fields with maxPebbleEncodedKeySize / maxSnapshotValueSize before allocation. Please validate kLen and vLen against equivalent bounds before calling this helper.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a native Pebble snapshot reader and associated tests to decode entries while stripping MVCC metadata. The reviewer identified several critical issues: a lack of bounds checking on key and value lengths that could lead to resource exhaustion or panics, improper handling of truncated snapshots when reading value lengths, and an inefficient use of reflection in the header parsing logic.

Comment thread internal/backup/snapshot_reader.go Outdated
valBuf *[]byte,
fn func(SnapshotHeader, SnapshotEntry) error,
) (bool, error) {
kLen, eof, err := readEntryLen(r)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The reader lacks bounds checking for kLen and vLen. Since these lengths are read directly from the input stream as uint64, a malformed or malicious snapshot file could specify extremely large lengths, leading to massive memory allocations and potential Out-Of-Memory (OOM) crashes in readExact and readExactGrow. Additionally, on 32-bit systems, converting a large uint64 to a slice index or length (which are int) will cause a panic.

It is recommended to enforce maximum size limits for keys and values, similar to the limits used in store/lsm_store.go (e.g., maxPebbleEncodedKeySize and maxSnapshotValueSize).

References
  1. When deserializing data with a size header, pre-allocating a buffer is acceptable if a strictly enforced maximum size limit is checked before allocation and the entire data must be materialized in memory.

Comment thread internal/backup/snapshot_reader.go Outdated
Comment on lines +191 to +200
vLen, _, err := readEntryLen(r)
if err != nil {
// A clean EOF here means the snapshot truncated between
// the key bytes and the value-length field — not the
// same as a clean inter-entry EOF.
if cockroachdberr.Is(err, io.EOF) {
return false, cockroachdberr.WithStack(ErrSnapshotTruncated)
}
return false, err
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The readOneEntry function ignores the eof return value from readEntryLen when reading the value length (vLen). If the snapshot is truncated exactly before the vLen field, readEntryLen returns (0, true, nil). In this case, err is nil, and the code proceeds to call readExactGrow with vLen=0, which succeeds. Subsequently, decodeSnapshotEntry will fail with ErrSnapshotShortValue because the value header is missing. This should instead return ErrSnapshotTruncated to accurately reflect the state of the stream.

Additionally, the check cockroachdberr.Is(err, io.EOF) is redundant because readEntryLen already handles io.EOF and io.ErrUnexpectedEOF, returning either (0, true, nil) or ErrSnapshotTruncated.

	vLen, vEof, err := readEntryLen(r)
	if err != nil {
		return false, err
	}
	if vEof {
		return false, cockroachdberr.WithStack(ErrSnapshotTruncated)
	}

Comment on lines +226 to +229
var ts uint64
if err := binary.Read(r, binary.LittleEndian, &ts); err != nil {
return SnapshotHeader{}, cockroachdberr.WithStack(err)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using binary.Read with a pointer to a scalar value is inefficient in Go because it uses reflection. For performance-sensitive code like a snapshot reader, it is better to read into a small byte array and use binary.LittleEndian.Uint64 directly, as done in readEntryLen.

Suggested change
var ts uint64
if err := binary.Read(r, binary.LittleEndian, &ts); err != nil {
return SnapshotHeader{}, cockroachdberr.WithStack(err)
}
var tsBuf [8]byte
if _, err := io.ReadFull(r, tsBuf[:]); err != nil {
return SnapshotHeader{}, cockroachdberr.WithStack(err)
}
ts := binary.LittleEndian.Uint64(tsBuf[:])

…before-vLen

Three P1+ findings from the first review round:

1. codex P1 (line 278) "Bound snapshot length prefixes before
   allocating"
2. gemini security-high + high (line 180) "Reader lacks bounds
   checking for kLen and vLen — adversarial snapshot can OOM the
   decoder or panic on 32-bit narrowing"
3. gemini high (line 200) "readOneEntry ignores the eof return value
   from readEntryLen on the value-length read — a truncated stream
   surfaces as the wrong error class (ErrSnapshotShortValue instead
   of ErrSnapshotTruncated)"

Findings 1+2 share a root cause: ReadSnapshot trusted the on-disk
length prefix and called make([]byte, n) directly. The live restore
path applies maxPebbleEncodedKeySize / maxSnapshotValueSize guards
before allocation (store/lsm_store.go::readRestoreFieldLen). The
backup-side reader was missing the equivalent. A corrupt or
adversarial snapshot whose length prefix declares 4 GB would either
OOM the operator decoder or, on 32-bit architectures, panic when
narrowing uint64 -> int for the slice length.

Fix: add MaxSnapshotEncodedKeySize (1 MiB + 8) and
MaxSnapshotEncodedValueSize (256 MiB + 9 + 34) bounds, check
before each allocation, fail closed with new sentinel errors
ErrSnapshotKeyTooLarge / ErrSnapshotValueTooLarge. Mirrors the
live store's constants; duplicated here so the backup package
keeps the offline-tool adapter-independence required by the design.

Finding 3 fix: refactor readOneEntry into readEntryKey +
readEntryValue helpers. readEntryKey distinguishes between
"natural inter-entry EOF" (kEof=true at the very first read of an
entry) and a truncated tail mid-entry (any other point). readEntryValue
treats any EOF as truncation — there is no valid clean-EOF state
after the key has been consumed. This also drops the readOneEntry
function under the cyclop budget.

Caller audit (per /loop standing instruction): the semantic change
in question is the error class returned when the snapshot truncates
between key and value-length. Previously: ErrSnapshotShortValue.
Now: ErrSnapshotTruncated. Callers of ReadSnapshot:

  grep -rn 'ReadSnapshot|SnapshotEntry|SnapshotHeader' --include=*.go
  -> all matches are inside internal/backup/snapshot_reader{,_test}.go

The dispatcher / cmd driver that will consume this API (Phase 0a
follow-up, not yet built) will see ErrSnapshotTruncated for any
mid-entry truncation regardless of which length-prefix the
truncation occurred before — a more correct semantic that matches
the docstring contract. No existing caller exists, so no caller
audit gap.

New tests:
- TestReadSnapshot_RejectsTruncatedBeforeValueLength pins gemini
  high finding (key-then-no-vLen -> ErrSnapshotTruncated)
- TestReadSnapshot_RejectsKeyLengthOverBudget pins codex P1 (key
  length-prefix > MaxSnapshotEncodedKeySize fails closed BEFORE
  allocation)
- TestReadSnapshot_RejectsValueLengthOverBudget pins the same for
  value side
- TestReadSnapshot_AcceptsKeyLengthAtBudgetBoundary pins off-by-one
  (length == budget is accepted; only > budget rejected)

Self-review:
1. Data loss - the bound rejects only over-budget entries the live
   store would never have written; legitimate entries up to the
   limits are still accepted (boundary test pins this).
2. Concurrency - no shared state; reader is single-pass over
   io.Reader as before.
3. Performance - one extra comparison per entry, far cheaper than
   the prevented OOM allocation.
4. Consistency - matches the live restore path's bounds and error
   shape. Mid-entry EOF now consistently surfaces as
   ErrSnapshotTruncated regardless of where in the entry the
   truncation occurred.
5. Coverage - 4 new tests in addition to the 17 existing ones.
   All pass with -race.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented May 19, 2026

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented May 19, 2026

Claude finished @bootjp's task in 3m 16s —— View job


Code Review — internal/backup/snapshot_reader.go (r1 revision)

  • Read current file state (post-r1 fixes)
  • Cross-reference with store/lsm_store.go
  • Review: correctness, safety, performance, consistency
  • Post findings

What r1 fixed ✓

The two blocking issues from the previous round are fully resolved:

Issue Fix
Discarded eof bool → wrong sentinel readOneEntry split into readEntryKey + readEntryValue; readEntryValue correctly checks the bool and returns ErrSnapshotTruncated
Unbounded make([]byte, n) on corrupt length prefix MaxSnapshotEncodedKeySize / MaxSnapshotEncodedValueSize constants added and checked before every allocation in both helpers
Missing test (truncation at value-length boundary) TestReadSnapshot_RejectsTruncatedBeforeValueLength added
Missing tests for over/at-budget lengths TestReadSnapshot_RejectsKeyLengthOverBudget, TestReadSnapshot_RejectsValueLengthOverBudget, TestReadSnapshot_AcceptsKeyLengthAtBudgetBoundary all added

Format constants cross-checked against store/lsm_store.go — all correct (sizes, masks, shifts, magic, value-header layout).


Remaining issues

Minor: typo snapshotEncStateCleartx (line 61)

The live constant is encStateCleartext; the backup reader has snapshotEncStateCleartx (the ext suffix was dropped). The tx ending makes it look like a transaction-related constant. Since the test file imports this directly, a future reader comparing both files would need to mentally map the truncation.

Fix this →


Minor: double WithStack on read-error paths (lines 272, 300)

Both readExact and readExactGrow already call cockroachdberr.WithStack on every returned error. readEntryKey:272 and readEntryValue:300 then wrap the result again, adding a spurious stack frame pointing to the caller rather than the actual error site:

// readEntryKey line 272
key, err := readExact(r, scratch[:0], kLen)
if err != nil {
    return nil, false, cockroachdberr.WithStack(err) // ← readExact already called WithStack
}
// readEntryValue line 300
*valBuf, err = readExactGrow(r, (*valBuf)[:0], vLen)
if err != nil {
    return nil, cockroachdberr.WithStack(err) // ← readExactGrow already called WithStack
}

Propagate err directly in both places. Fix this →


Minor: binary.Read with reflection in readSnapshotHeader (line 317)

readSnapshotHeader uses binary.Read(r, binary.LittleEndian, &ts) for the 8-byte lastCommitTS field. All other 8-byte reads in this file use the zero-allocation path ([8]byte + binary.LittleEndian.Uint64) that Gemini flagged in the prior round. This is the only outlier:

// current
var ts uint64
if err := binary.Read(r, binary.LittleEndian, &ts); err != nil { ... }

// consistent with readEntryLen
var tsBuf [8]byte
if _, err := io.ReadFull(r, tsBuf[:]); err != nil {
    return SnapshotHeader{}, cockroachdberr.WithStack(err)
}
ts := binary.LittleEndian.Uint64(tsBuf[:])

Fix this →


Design (unchanged): SnapshotHeader not surfaced for zero-entry snapshots

For a snapshot with no entries fn is never called, so the caller never sees LastCommitTS. The test acknowledges this (_ = sawHeader) but the limitation remains: if a shard snapshot is empty, the backup driver can't record the commit-horizon in MANIFEST.json.

The cleanest fix is to return the header as a direct return value (parallel to how io.ReadCloser returns both the reader and an error):

func ReadSnapshot(r io.Reader, fn func(SnapshotEntry) error) (SnapshotHeader, error)

This changes the public API, which warrants a decision before the follow-up PR (cmd driver) lands — once the driver consumes the current signature, changing it becomes a two-PR refactor. The callback already receives the header per-entry, so the driver could alternatively be written to cache it from the first call. Either approach is acceptable; the goal is just to land a decision before the consumer ships.


Nit: misleading nolint comment in test (line 154)

b.WriteEntry([]byte(k), uint64(i+1), []byte{byte(i)}, false, 0, snapshotEncStateCleartx) //nolint:gosec // i bounded by len(keys)<<63

len(keys) << 63 is an astronomically large number — the comment probably meant i < len(keys) ≤ 3 < 256 (so byte(i) can't overflow). Fix this →


Summary

Severity Issue Status
Bug eof bool discarded → wrong sentinel + dead io.EOF branch ✅ Fixed in r1
Robustness Unbounded allocation on corrupt length prefix ✅ Fixed in r1
Minor Typo snapshotEncStateCleartxsnapshotEncStateCleartext Still open
Minor Double WithStack on read-error paths Still open
Minor binary.Read with reflection in header parse Still open
Design SnapshotHeader invisible for zero-entry snapshots Still open (decision needed before consumer PR)
Nit Misleading nolint comment in test Still open

The two critical issues are resolved. The remaining items are all minor; the design question about SnapshotHeader is the only one worth settling before the cmd-driver follow-up lands.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
internal/backup/snapshot_reader_test.go (1)

152-155: ⚡ Quick win

Drop //nolint:gosec here by removing the narrowing cast path.

You can avoid the suppression with prebuilt byte payloads and keep intent explicit.

Proposed fix
 	keys := []string{"!redis|str|a", "!redis|str|b", "!redis|str|c"}
+	bodies := [][]byte{{0}, {1}, {2}}
 	for i, k := range keys {
-		b.WriteEntry([]byte(k), uint64(i+1), []byte{byte(i)}, false, 0, snapshotEncStateCleartx) //nolint:gosec // i bounded by len(keys)<<63
+		b.WriteEntry([]byte(k), uint64(i+1), bodies[i], false, 0, snapshotEncStateCleartx)
 	}

As per coding guidelines, "**/*.go: Use gofmt formatting and pass linters... Avoid //nolint annotations — refactor instead."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/backup/snapshot_reader_test.go` around lines 152 - 155, The test
uses a narrowing cast with []byte{byte(i)} and suppresses gosec; instead
prebuild explicit byte payloads and use them in the loop so the nolint can be
removed: create a slice of byte-slices (e.g., payloads := [][]byte{{0}, {1},
{2}}) and pass payloads[i] into b.WriteEntry (the call to b.WriteEntry in
snapshot_reader_test.go where keys is iterated) rather than doing
[]byte{byte(i)} and delete the //nolint:gosec comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/backup/snapshot_reader_test.go`:
- Around line 301-329: The test
TestReadSnapshot_AcceptsKeyLengthAtBudgetBoundary claims to exercise the
boundary at MaxSnapshotEncodedKeySize but currently uses a hard-coded 1 KiB user
key; update the test to build a userKey whose length makes the total encoded key
equal to MaxSnapshotEncodedKeySize (i.e., set userKeyLen =
MaxSnapshotEncodedKeySize - 8 to account for the 8-byte timestamp suffix used by
the snapshot encoding) so the test actually verifies keyLen ==
MaxSnapshotEncodedKeySize; adjust the userKey construction and leave the
ReadSnapshot callback and assertions unchanged (references:
TestReadSnapshot_AcceptsKeyLengthAtBudgetBoundary, MaxSnapshotEncodedKeySize,
newSnapBuilder, ReadSnapshot).

In `@internal/backup/snapshot_reader.go`:
- Around line 102-105: PebbleSnapshotMagic is exported as a mutable var which
allows callers to mutate the magic header; replace it with an immutable constant
(e.g., a const string) or an unexported const byte sequence and update all
comparisons to use the immutable value (convert to []byte or compare strings as
needed). Specifically, change the declaration of PebbleSnapshotMagic (and keep
PebbleSnapshotMagicLen semantics) so it is a const, then update any
readers/sniffers that currently compare against the var to use the new const (or
[]byte(constVal)) to ensure callers cannot mutate the magic bytes.

---

Nitpick comments:
In `@internal/backup/snapshot_reader_test.go`:
- Around line 152-155: The test uses a narrowing cast with []byte{byte(i)} and
suppresses gosec; instead prebuild explicit byte payloads and use them in the
loop so the nolint can be removed: create a slice of byte-slices (e.g., payloads
:= [][]byte{{0}, {1}, {2}}) and pass payloads[i] into b.WriteEntry (the call to
b.WriteEntry in snapshot_reader_test.go where keys is iterated) rather than
doing []byte{byte(i)} and delete the //nolint:gosec comment.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 706b1a28-cf25-469d-b45e-ab3f8fc06ffd

📥 Commits

Reviewing files that changed from the base of the PR and between 8c0d07f and a86e6ba.

📒 Files selected for processing (2)
  • internal/backup/snapshot_reader.go
  • internal/backup/snapshot_reader_test.go

Comment thread internal/backup/snapshot_reader_test.go
Comment thread internal/backup/snapshot_reader.go Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a86e6ba07a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +231 to +234
if eof {
// Clean inter-entry EOF — natural terminator.
return true, nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Expose snapshot header when entry stream is empty

A valid snapshot can contain just the header (EKVPBBL1 + lastCommitTS) and zero entries, but this path returns immediately on inter-entry EOF before invoking fn, so callers can never observe SnapshotHeader.LastCommitTS for empty snapshots. That breaks the stated contract that the header is surfaced to callers and prevents downstream tooling from recording last_commit_ts in manifests for empty-database backups.

Useful? React with 👍 / 👎.

…ic + boundary test

coderabbit raised 2 Major findings on round 1:

1. line 105 "Avoid exporting mutable magic bytes as a package
   variable" - PebbleSnapshotMagic was a `var [8]byte`, so any
   importer could mutate it and break parsing globally
   (`backup.PebbleSnapshotMagic[0] = 0xff` is legal Go).
2. line 329 "Boundary test name/intent doesn't match what it
   verifies" - TestReadSnapshot_AcceptsKeyLengthAtBudgetBoundary
   used a 1 KiB key, which is nowhere near the 1 MiB+8
   MaxSnapshotEncodedKeySize budget. A regression flipping `>` to
   `>=` would not be caught by the existing test.

Fixes:

1. Change `PebbleSnapshotMagic` from a `var [PebbleSnapshotMagicLen]byte`
   to an untyped `const string` carrying the same bytes. Strings are
   immutable in Go, so this closes the package-variable mutation
   surface. Update the only consumers (readSnapshotHeader's bytes
   comparison and the test snapBuilder) to use the new type.

2. Resize the boundary test's user key so the ENCODED key
   (userKey + 8-byte TS suffix) lands at EXACTLY
   MaxSnapshotEncodedKeySize, then assert successful round-trip.
   A `>=` vs `>` regression now surfaces.

Caller audit (per /loop standing instruction): the type change to
PebbleSnapshotMagic from `[8]byte` to `string` is a public API
shape change. PR #792 introduces this symbol; no external caller
exists yet (`grep -rn PebbleSnapshotMagic --include=*.go` shows
matches only inside the two files in this PR). Caller audit clean.

Self-review:
1. Data loss - none. The byte content of the magic is unchanged.
2. Concurrency - the previous shape was technically race-prone if
   importers wrote concurrently. The string const cannot race.
3. Performance - one tiny allocation per call (`[]byte(string)` for
   the comparison). Bounded; called once per snapshot file open.
4. Consistency - immutable-magic pattern matches the live store's
   `pebbleSnapshotMagic` (which is package-private, also immutable
   in practice). The new boundary test now actually exercises the
   budget threshold.
5. Coverage - existing 21 snapshot tests still pass with the new
   type. The boundary test is now a real off-by-one guard
   (allocates 1 MiB instead of 1 KiB during the test — still
   well within go test memory budget).
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented May 19, 2026

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented May 19, 2026

Claude finished @bootjp's task in 1m 55s —— View job


Code Review — r2 revision

  • Read current file state (post-r2 fixes)
  • Cross-reference with store/lsm_store.go
  • Review: correctness, safety, performance, consistency
  • Post findings

What r2 fixed ✓

Issue Fix
Mutable var [8]byte magic → any importer could corrupt it globally PebbleSnapshotMagic is now a const string
TestReadSnapshot_AcceptsKeyLengthAtBudgetBoundary used a hard-coded 1 KiB key, so a >= vs > regression would go undetected Test now sizes userKey to MaxSnapshotEncodedKeySize - snapshotTSSize so encoded key == budget exactly

Both CodeRabbit Major items from round 2 are resolved. Format constants still match store/lsm_store.go exactly (sizes, masks, shifts, magic, value-header layout).


Remaining issues

Minor: typo snapshotEncStateCleartx (line 61)

The live constant is encStateCleartext (store/lsm_store.go:35). The backup reader has snapshotEncStateCleartx — the ext suffix was dropped, leaving tx which reads like a transaction constant. It appears in both the switch arm (decodeSnapshotEntry:414) and every test invocation that passes the cleartext state. Fix this →


Minor: double WithStack on read-error paths (lines 279, 307)

readExact and readExactGrow both call cockroachdberr.WithStack on every error they return (lines 363–365, 381–383). The callers then wrap again:

// readEntryKey line 279
key, err := readExact(r, scratch[:0], kLen)
if err != nil {
    return nil, false, cockroachdberr.WithStack(err) // readExact already wrapped
}

// readEntryValue line 307
*valBuf, err = readExactGrow(r, (*valBuf)[:0], vLen)
if err != nil {
    return nil, cockroachdberr.WithStack(err) // readExactGrow already wrapped
}

Propagate err directly in both places. Fix this →


Minor: binary.Read with reflection in readSnapshotHeader (line 324)

var ts uint64
if err := binary.Read(r, binary.LittleEndian, &ts); err != nil { ... }

This is the only binary.Read in the file; every other 8-byte field uses the zero-reflection path ([8]byte + io.ReadFull + binary.LittleEndian.Uint64). Fix this →


Minor: magic comparison allocates (line 319)

if !bytes.Equal(magic[:], []byte(PebbleSnapshotMagic)) { ... }

PebbleSnapshotMagic is now a const string; converting it to []byte allocates a fresh slice on each call. Use string comparison instead:

if string(magic[:]) != PebbleSnapshotMagic { ... }

Design (unchanged): SnapshotHeader invisible for zero-entry snapshots

For a snapshot containing only the header, fn is never called, so the caller never observes LastCommitTS. The test explicitly acknowledges this (_ = sawHeader). For the backup-manifest use case (recording the commit horizon in MANIFEST.json for every shard, including empty ones), this is a gap.

The cleanest resolution before the cmd-driver follow-up lands — while the API is still caller-free — is to surface the header as a direct return value:

func ReadSnapshot(r io.Reader, fn func(SnapshotEntry) error) (SnapshotHeader, error)

This is worth deciding before the consumer PR ships; changing the signature after the driver consumes the current form becomes a two-PR refactor.


Nit: misleading nolint comment in test (line 154)

b.WriteEntry([]byte(k), uint64(i+1), []byte{byte(i)}, false, 0, snapshotEncStateCleartx) //nolint:gosec // i bounded by len(keys)<<63

len(keys) << 63 is 3 << 63 = 2.76e19 — an astronomically large number that doesn't explain why byte(i) can't overflow. The actual bound is i < len(keys) == 3 < 256. Per CLAUDE.md conventions, avoid //nolint by refactoring: use a pre-built bodies := [][]byte{{0}, {1}, {2}} slice and index into it. Fix this →


Summary

Severity Issue Status
Bug eof bool discarded → wrong sentinel ✅ Fixed r1
Robustness Unbounded allocation on corrupt length prefix ✅ Fixed r1
Major PebbleSnapshotMagic mutable package var ✅ Fixed r2
Major Boundary test used 1 KiB key, not actual budget ✅ Fixed r2
Minor Typo snapshotEncStateCleartx Still open
Minor Double WithStack on read-error paths Still open
Minor binary.Read reflection in header parse Still open
Minor Magic comparison allocates []byte(const) Still open
Design SnapshotHeader invisible for zero-entry snapshots Still open — decide before consumer PR lands
Nit Misleading / wrong //nolint:gosec comment Still open

All blocking issues are resolved. The remaining items are minor; the design question about SnapshotHeader is the only one time-sensitive (before the cmd-driver follow-up ships).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant